Abstract

This paper considers the issue of modeling fractional data observed on [0,1), (0,1] or [0,1].
Mixed continuous-discrete distributions are proposed. The beta distribution is used to 
describe the continuous component of the model since its density can have quite different 
shapes depending on the values of the two parameters that index the distribution. Prop- 
erties of the proposed distributions are examined. Also, estimation based on maximum 
likelihood and conditional moments is discussed. Finally, practical applications that em- 
ploy real data are presented. 

Keywords and Phrases: Beta distribution; inflated beta distribution; fractional data; 
maximum likelihood estimation; conditional moments; mixture; proportions. 



1 Introduction 

Many studies in different areas involve data in the form of fractions, rates or proportions that 
are measured continuously in the open interval (0, 1). However, frequently the data contain 
zeros and/or ones. In such cases, continuous distributions are not suitable for modeling the 
data. In this work, we propose mixed continuous-discrete distributions to model data that are 
observed on [0,1), (0,1] or [0,1]. The proposed distributions capture the probability mass at 
0, at 1 or both, depending on the case. For data observed on [0, 1) or (0, 1] we use a mixture 
of a continuous distribution on (0, 1) and a degenerate distribution that assigns non-negative 
probability to 0 or 1, depending on the case. If the response variable is observed on the
closed interval [0, 1] we use a mixture of a continuous distribution on (0, 1) and the Bernoulli
distribution, which gives non-negative probabilities to 0 and 1. These models are special cases
of the class of inflated models. The word inflated suggests that the probability mass of some
points exceeds what is allowed by the proposed model (Tu, 2002). Some related works include
Aitchison (1955), Feuerverger (1979), Yoo (2004), Heller, Stasinopoulos & Rigby (2006),
Cook, Kieschnick & McCullough (2004) and Lesaffre, Rizopoulos & Tsonaka (2007).

The paper unfolds as follows. Section 2 presents the zero- and one-inflated beta distri- 
butions and discusses some of their properties. Estimation based on maximum likelihood 
and conditional moments is presented. Section 3 introduces the zero-and-one-inflated beta 
distribution, some of its properties and estimation based on maximum likelihood and con- 
ditional moments. In Section 4, Monte Carlo simulation studies are carried out to examine 
the performance of the proposed estimators. Section 5 contains applications of the proposed 
distributions and Tobit models to real data. For all the applications the inflated beta distri- 
butions fitted the data better. Section 6 closes the paper with concluding remarks. 



2 Zero- and one-inflated beta distributions 

The beta distribution is very flexible for modeling data that are measured on a continuous
scale on the open interval (0, 1) since its density has quite different shapes depending
on the values of the two parameters that index the distribution; see Johnson, Kotz &
Balakrishnan (1995, Chapter 25, Section 1), Kieschnick & McCullough (2003) and Ferrari and
Cribari-Neto (2004). The beta distribution with parameters $\mu$ and $\phi$ ($0 < \mu < 1$ and
$\phi > 0$), denoted by $\mathcal{B}(\mu, \phi)$, has density function

$$ f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\, y^{\mu\phi - 1} (1 - y)^{(1-\mu)\phi - 1}, \quad y \in (0, 1), \qquad (1) $$

where $\Gamma(\cdot)$ is the gamma function. If $y \sim \mathcal{B}(\mu, \phi)$, then $E(y) = \mu$ and
$\mathrm{Var}(y) = V(\mu)/(\phi + 1)$, where $V(\mu) = \mu(1 - \mu)$ denotes the "variance function". The
parameter $\phi$ plays the role of a precision parameter in the sense that, for fixed $\mu$, the
larger the value of $\phi$, the smaller the variance of $y$. Different values of the parameters
generate different shapes of the beta density (unimodal, 'U', 'J', inverted 'J', uniform).
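The mean and precision parameterization can be checked numerically. The sketch below is only an illustration: the function names and the values $\mu = 0.3$, $\phi = 5$ are assumptions, and the midpoint rule is a convenient stand-in for any quadrature method. It evaluates the beta density with shape parameters $\mu\phi$ and $(1-\mu)\phi$ and compares the numerical mean and variance with $\mu$ and $V(\mu)/(\phi + 1)$.

```python
import math

def beta_pdf(y, mu, phi):
    """Beta density in the mean/precision form, with shapes mu*phi and (1-mu)*phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def numeric_mean_var(mu, phi, n_grid=100_000):
    """Midpoint-rule approximations to E(y) and Var(y) over (0, 1)."""
    h = 1.0 / n_grid
    m1 = m2 = 0.0
    for i in range(n_grid):
        y = (i + 0.5) * h
        w = beta_pdf(y, mu, phi) * h
        m1 += y * w
        m2 += y * y * w
    return m1, m2 - m1 * m1

mu, phi = 0.3, 5.0  # illustrative values only, not taken from the paper
mean_num, var_num = numeric_mean_var(mu, phi)
var_formula = mu * (1.0 - mu) / (phi + 1.0)
print(mean_num, mu)           # numerical mean next to mu
print(var_num, var_formula)   # numerical variance next to V(mu)/(phi + 1)
```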

In practical applications the data may include zeros and/or ones. The beta distribution
is not suitable for modeling the data in these situations. If the data set contains zeros or
ones (but not both) it is natural to model the data using a mixture of two distributions: a
beta distribution and a degenerate distribution at a known value $c$, where $c = 0$ or $c = 1$,
depending on the case. The cumulative distribution function of the mixture distribution is
given by

$$ \mathrm{BI}_c(y; \alpha, \mu, \phi) = \alpha\, \mathbb{1}_{[c,1]}(y) + (1 - \alpha) F(y; \mu, \phi), $$

where $\mathbb{1}_A(y)$ is an indicator function that equals 1 if $y \in A$ and 0 if $y \notin A$. Here,
$F(\cdot; \mu, \phi)$ is the cumulative distribution function of the beta distribution
$\mathcal{B}(\mu, \phi)$ and $0 < \alpha < 1$ is the mixture parameter. The corresponding probability
density function with respect to the measure generated by the mixture$^1$ is given by

$$ \mathrm{bi}_c(y; \alpha, \mu, \phi) = \begin{cases} \alpha, & \text{if } y = c, \\ (1 - \alpha) f(y; \mu, \phi), & \text{if } y \in (0, 1), \end{cases} \qquad (2) $$

where $f(y; \mu, \phi)$ is the beta density (1). Note that $\alpha$ is the probability mass at $c$ and
represents the probability of observing 0 ($c = 0$) or 1 ($c = 1$).

$^1$The probability measure $P$ corresponding to $\mathrm{BI}_c(y; \cdot)$, defined over the measurable
space $((0, 1) \cup \{c\}, \mathfrak{B})$, where $\mathfrak{B}$ is the class of all Borel subsets of
$(0, 1) \cup \{c\}$, is such that $P \ll \lambda + \delta_c$, with $\lambda$ representing the Lebesgue measure
and $\delta_c$ a point mass at $c$, i.e. $\delta_c(A) = 1$ if $c \in A$ and $\delta_c(A) = 0$ if $c \notin A$,
$A \in \mathfrak{B}$.

Definition 2.1. Let $y$ be a random variable that follows the inflated beta distribution (2).

1. If $c = 0$, distribution (2) is called zero-inflated beta distribution (BEZI) and we write
$y \sim \mathrm{BEZI}(\alpha, \mu, \phi)$.

2. If $c = 1$, distribution (2) is called one-inflated beta distribution (BEOI) and we write
$y \sim \mathrm{BEOI}(\alpha, \mu, \phi)$.

If $y \sim \mathrm{BEZI}(\alpha, \mu, \phi)$, then $\alpha = P(y = 0)$ and if $y \sim \mathrm{BEOI}(\alpha, \mu, \phi)$, then
$\alpha = P(y = 1)$. Hence, those distributions allow us to include a mass point at 0 or 1 in the
beta distribution.
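The two-component structure of the BEZI and BEOI distributions translates directly into a density evaluator and a sampler. The following is a minimal sketch under stated assumptions: the function names are hypothetical, $\alpha = 0.4$ mirrors the value quoted for Figure 1, and $\mu = 0.5$, $\phi = 5$ are arbitrary illustrative choices.

```python
import math
import random

def beta_pdf(y, mu, phi):
    """Beta density with shape parameters mu*phi and (1-mu)*phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def inflated_beta_density(y, alpha, mu, phi, c):
    """Density of the inflated beta law: mass alpha at c (0 or 1), beta part elsewhere."""
    if y == c:
        return alpha
    if 0.0 < y < 1.0:
        return (1.0 - alpha) * beta_pdf(y, mu, phi)
    return 0.0  # outside the support (0, 1) union {c}

def sample_inflated_beta(n, alpha, mu, phi, c, rng):
    """Draw n values: c with probability alpha, otherwise a beta variate."""
    a, b = mu * phi, (1.0 - mu) * phi
    return [float(c) if rng.random() < alpha else rng.betavariate(a, b) for _ in range(n)]

rng = random.Random(12345)  # fixed seed so the illustration is reproducible
sample = sample_inflated_beta(20_000, alpha=0.4, mu=0.5, phi=5.0, c=0, rng=rng)
share_at_c = sum(1 for y in sample if y == 0.0) / len(sample)
print(share_at_c)  # empirical estimate of P(y = 0), to be compared with alpha
```

Setting `c=1` in both functions gives the BEOI counterpart; nothing else in the sketch changes.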

The $r$th moment of $y$ and its variance can be written as

$$ E(y^r) = \alpha c + (1 - \alpha)\mu_r, \quad r = 1, 2, \ldots, $$
$$ \mathrm{Var}(y) = (1 - \alpha)\frac{V(\mu)}{\phi + 1} + \alpha(1 - \alpha)(c - \mu)^2, \qquad (3) $$

where $\mu_r = (\mu\phi)_{(r)}/(\phi)_{(r)}$, with $a_{(r)} = a(a + 1) \cdots (a + r - 1)$, is the $r$th moment of
the beta distribution (1). Note that $E(y^r)$ is the weighted average of the $r$th moment of the
degenerate distribution at $c$ and the corresponding moment of the beta distribution
$\mathcal{B}(\mu, \phi)$ with weights $\alpha$ and $1 - \alpha$, respectively. In particular,
$E(y) = \alpha c + (1 - \alpha)\mu$.
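As a worked check of the moment formulas, the sketch below (function names are illustrative assumptions) evaluates the mean and variance for the BEZI configuration used later in the simulation study, $\alpha = 0.2$, $\mu = 0.1$, $\phi = 2$, and recomputes the variance from the raw moments $E(y^r) = \alpha c + (1 - \alpha)\mu_r$.

```python
def beta_raw_moment(r, mu, phi):
    """r-th raw moment of the beta law as a ratio of rising factorials."""
    num = den = 1.0
    for k in range(r):
        num *= mu * phi + k
        den *= phi + k
    return num / den

def inflated_beta_mean_var(alpha, mu, phi, c):
    """Mean and variance of the zero- or one-inflated beta law (c is 0 or 1)."""
    mean = alpha * c + (1.0 - alpha) * mu
    var = ((1.0 - alpha) * mu * (1.0 - mu) / (phi + 1.0)
           + alpha * (1.0 - alpha) * (c - mu) ** 2)
    return mean, var

alpha, mu, phi, c = 0.2, 0.1, 2.0, 0  # BEZI setting of the simulation study
mean, var = inflated_beta_mean_var(alpha, mu, phi, c)

# Cross-check through the raw moments E(y^r) = alpha*c + (1 - alpha)*mu_r.
m1 = alpha * c + (1.0 - alpha) * beta_raw_moment(1, mu, phi)
m2 = alpha * c + (1.0 - alpha) * beta_raw_moment(2, mu, phi)
print(mean, var, m2 - m1 ** 2)
```

Both routes give a mean of 0.08 and a variance of 0.0256 up to floating-point rounding, the values reported with Table 1.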

Figure 1 presents BEZI densities for different choices of $\mu$ and $\phi$ with fixed $\alpha$. Note that
for all $\mu$ and $\phi$ the BEZI distribution is asymmetrical because of the probability mass at 0.
Also, the BEZI density may be unimodal and may have 'J', 'U', inverted 'J' and uniform
shapes. In these graphs, the vertical bar with the circle above represents $\alpha = P(y = 0)$.
Similarly, the BEOI distribution is asymmetrical because of the probability mass at 1 and,
for identical choices of the parameters, the BEZI and BEOI distributions have the same
functional shape on the interval (0, 1). However, they differ in the mass point, being at 0 for
the BEZI distribution and at 1 for the BEOI distribution.

Proposition 2.1. The zero- and one-inflated beta distributions are three-parameter exponen- 
tial family distributions of full rank. 

Proof. Let $\eta = (\eta_1, \eta_2, \eta_3)$, with $\eta_1 = \log(\alpha/(1 - \alpha)) + B(\eta_2, \eta_3)$, $\eta_2 = \mu\phi$ and
$\eta_3 = (1 - \mu)\phi$, where $B(\eta_2, \eta_3) = \log(\Gamma(\eta_2)\Gamma(\eta_3)/\Gamma(\eta_2 + \eta_3))$. Let
$T(y) = (t_1(y), t_2(y), t_3(y))$, where $t_1(y) = \mathbb{1}_{\{c\}}(y)$, $t_2(y) = \log y$ if $y \in (0, 1)$ and 0 if
$y = c$, and $t_3(y) = \log(1 - y)$ if $y \in (0, 1)$ and 0 if $y = c$. Note that density (2) can be
written as

$$ \exp\{\eta^\top T(y) - B^*(\eta)\}\, h(y), \qquad (4) $$

where $B^*(\eta) = \log\{1 + \exp[\eta_1 - B(\eta_2, \eta_3)]\} + B(\eta_2, \eta_3)$ is a real-valued function of $\eta$
and $h(y) = 1/\{y(1 - y)\}$ if $y \in (0, 1)$ and 1 otherwise is a positive function defined over the
set $(0, 1) \cup \{c\}$. The parameterization $\eta$ defines a one-to-one transformation which maps
$\mathcal{X} = \{(\alpha, \mu, \phi) : (\alpha, \mu, \phi) \in (0, 1) \times (0, 1) \times \mathbb{R}_+\}$ onto
$\mathcal{D} = \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}_+$, i.e., the Jacobian of the transformation is nonzero for all
$\eta \in \mathcal{D}$, an open subset of $\mathbb{R}^3$. Additionally, neither the $t$'s nor the $\eta$'s satisfy
linear constraints and the parameter space contains a three-dimensional rectangle. Therefore,
(4) is the canonical representation of the inflated beta distribution in the three-parameter
exponential family of full rank. □
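The canonical form used in the proof can be verified numerically by evaluating the density both directly and through the natural parameters. The sketch below is an illustration only; the function names and the parameter values are assumptions.

```python
import math

def log_beta_fn(a, b):
    """log of the beta function, i.e. B(eta2, eta3) in the proof."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def density_direct(y, alpha, mu, phi, c):
    """Inflated beta density written as a two-branch mixture."""
    if y == c:
        return alpha
    a, b = mu * phi, (1.0 - mu) * phi
    log_f = (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y) - log_beta_fn(a, b)
    return (1.0 - alpha) * math.exp(log_f)

def density_canonical(y, alpha, mu, phi, c):
    """Same density written as exp{eta'T(y) - B*(eta)} h(y)."""
    eta2, eta3 = mu * phi, (1.0 - mu) * phi
    big_b = log_beta_fn(eta2, eta3)
    eta1 = math.log(alpha / (1.0 - alpha)) + big_b
    b_star = math.log(1.0 + math.exp(eta1 - big_b)) + big_b
    if y == c:
        t, h = (1.0, 0.0, 0.0), 1.0
    else:
        t, h = (0.0, math.log(y), math.log(1.0 - y)), 1.0 / (y * (1.0 - y))
    return math.exp(eta1 * t[0] + eta2 * t[1] + eta3 * t[2] - b_star) * h

# Illustrative parameter values; the two columns should agree at every point.
for y in (0.0, 0.05, 0.5, 0.95):
    print(y, density_direct(y, 0.4, 0.3, 5.0, 0), density_canonical(y, 0.4, 0.3, 5.0, 0))
```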
Figure 1: BEZI densities for different values of $\mu$ and $\phi$; $\alpha = 0.4$.



Let $y_1, \ldots, y_n$ be $n$ independent random variables, where each $y_t$ has density (2). A
consequence of Proposition 2.1 is that $\sum_{t=1}^n T(y_t) = (T_1, T_2, T_3)$, with
$T_1 = \sum_{t=1}^n \mathbb{1}_{\{c\}}(y_t)$, $T_2 = \sum_{t: y_t \in (0,1)} \log y_t$ and
$T_3 = \sum_{t: y_t \in (0,1)} \log(1 - y_t)$, is a complete (minimal) sufficient
statistic (Lehmann & Casella, 1998, Corollary 1.6.16 and Theorem 1.6.22).

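The likelihood function for $\theta = (\alpha, \mu, \phi)$ given the sample $(y_1, \ldots, y_n)$ is

$$ L(\theta) = \prod_{t=1}^n \mathrm{bi}_c(y_t; \alpha, \mu, \phi) = L_1(\alpha) L_2(\mu, \phi), $$

where

$$ L_1(\alpha) = \prod_{t=1}^n \alpha^{\mathbb{1}_{\{c\}}(y_t)} (1 - \alpha)^{1 - \mathbb{1}_{\{c\}}(y_t)} = \alpha^{T_1}(1 - \alpha)^{n - T_1}, $$

$$ L_2(\mu, \phi) = \prod_{t=1}^n f(y_t; \mu, \phi)^{1 - \mathbb{1}_{\{c\}}(y_t)}. $$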

The likelihood function $L(\theta)$ factorizes in two terms; the first term depends only on $\alpha$ and
the second, only on $(\mu, \phi)$. Hence, the parameters are separable (Pace & Salvan, 1997, p.
128) and maximum likelihood inference for $(\mu, \phi)$ can be performed separately from that for
$\alpha$, as if the value of $\alpha$ were known, and vice-versa.

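The log-likelihood function for the inflated beta distribution (2) is given by

$$ \ell(\theta) = \log L(\theta) = \ell_1(\alpha) + \ell_2(\mu, \phi), $$

where

$$ \ell_1(\alpha) = T_1 \log \alpha + (n - T_1) \log(1 - \alpha), $$

$$ \ell_2(\mu, \phi) = (n - T_1) \log\left\{ \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\Gamma((1 - \mu)\phi)} \right\} + T_2(\mu\phi - 1) + T_3((1 - \mu)\phi - 1). $$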

The score function obtained by differentiating the log-likelihood function with respect to the
unknown parameters is $(U_\alpha(\alpha), U_\mu(\mu, \phi), U_\phi(\mu, \phi))$, where

$$ U_\alpha(\alpha) = T_1/\alpha - (n - T_1)/(1 - \alpha), $$
$$ U_\mu(\mu, \phi) = \phi\{(n - T_1)[\psi((1 - \mu)\phi) - \psi(\mu\phi)] + T_2 - T_3\}, $$
$$ U_\phi(\mu, \phi) = (n - T_1)[\psi(\phi) - \mu\psi(\mu\phi) - (1 - \mu)\psi((1 - \mu)\phi)] + \mu T_2 + (1 - \mu) T_3, $$

and $\psi(\cdot)$ denotes the digamma function. The maximum likelihood (ML) estimator of
$\alpha$ is $\hat\alpha = T_1/n$ and represents the proportion of zeros ($c = 0$) or ones ($c = 1$) in the sample.
Since $\hat\alpha$ is a function of a complete sufficient statistic and is an unbiased estimator of $\alpha$, it
is the uniformly minimum variance unbiased estimator (UMVUE) of $\alpha$ (Lehmann & Casella,
1998, Theorem 2.1.11); its variance is given by $\mathrm{Var}(\hat\alpha) = \alpha(1 - \alpha)/n$. The maximum likelihood
estimators of $\mu$ and $\phi$ are obtained from the equations $U_\mu(\mu, \phi) = 0$ and $U_\phi(\mu, \phi) = 0$, and
do not have closed form. They can be obtained by numerically maximizing the log-likelihood
function $\ell_2(\mu, \phi)$ using a nonlinear optimization algorithm, such as a Newton algorithm or a
quasi-Newton algorithm; for details, see Nocedal & Wright (1999). Recently, the BEZI and
BEOI distributions were incorporated into the gamlss.dist package in R (Ospina, 2006).
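The separability of the likelihood can be illustrated with a small fitting routine. The sketch below is not the gamlss.dist implementation and does not use a Newton or quasi-Newton method: as a stated simplification it maximizes the beta part of the log-likelihood with a derivative-free compass search on the unconstrained scale (logit of $\mu$, log of $\phi$). Function names, the seed and the simulated parameter values are assumptions made for illustration.

```python
import math
import random

def beta_loglik(mu, phi, m, t2, t3):
    """Beta part of the log-likelihood, written with the sufficient statistics (m = n - T1)."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (m * (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b))
            + t2 * (a - 1.0) + t3 * (b - 1.0))

def fit_inflated_beta(sample):
    """ML fit for data on (0,1) plus a single inflated point c: alpha in closed form, (mu, phi) numerically."""
    n = len(sample)
    inner = [y for y in sample if 0.0 < y < 1.0]
    m = len(inner)                       # n - T1
    alpha_hat = (n - m) / n              # closed-form ML estimator T1 / n
    t2 = sum(math.log(y) for y in inner)
    t3 = sum(math.log(1.0 - y) for y in inner)

    def objective(u, v):                 # u = logit(mu), v = log(phi)
        return beta_loglik(1.0 / (1.0 + math.exp(-u)), math.exp(v), m, t2, t3)

    # Compass search: a simple stand-in for the Newton-type methods named in the text.
    u, v, step = 0.0, 0.0, 1.0
    best = objective(u, v)
    while step > 1e-6:
        improved = False
        for du, dv in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = objective(u + du, v + dv)
            if cand > best:
                u, v, best, improved = u + du, v + dv, cand, True
        if not improved:
            step /= 2.0
    return alpha_hat, 1.0 / (1.0 + math.exp(-u)), math.exp(v)

rng = random.Random(2024)
# Simulated BEZI data with alpha = 0.2, mu = 0.3, phi = 5 (illustrative values).
data = [0.0 if rng.random() < 0.2 else rng.betavariate(1.5, 3.5) for _ in range(5000)]
alpha_hat, mu_hat, phi_hat = fit_inflated_beta(data)
print(alpha_hat, mu_hat, phi_hat)
```

Because the routine only needs $n - T_1$, $T_2$ and $T_3$, each objective evaluation costs the same regardless of the sample size.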

We can obtain estimators for $(\mu, \phi)$ based on conditional moments of $y$ given that
$y \in (0, 1)$, which do not depend on $\alpha$. Observe that $E(y \mid y \in (0, 1)) = \mu$ and
$\mathrm{Var}(y \mid y \in (0, 1)) = \mu(1 - \mu)/(\phi + 1)$. For $T_1 < n$,$^2$ the solution of the system of equations
$(\bar{y}, s^2)^\top = (\mu, \mu(1 - \mu)/(\phi + 1))^\top$, with $\bar{y} = \sum_{t: y_t \in (0,1)} y_t/(n - T_1)$ and
$s^2 = \sum_{t: y_t \in (0,1)} (y_t - \bar{y})^2/(n - T_1)$, gives the following closed-form estimators for
$\mu$ and $\phi$: $\tilde\mu = \bar{y}$ and $\tilde\phi = \{\tilde\mu(1 - \tilde\mu)/s^2\} - 1$.
Closed-form estimators of $E(y^r)$ and $\mathrm{Var}(y)$ can be obtained by replacing $\alpha$, $\mu$ and $\phi$ by
$\hat\alpha$, $\tilde\mu$ and $\tilde\phi$ in (3).
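The conditional-moment estimators are simple enough to compute by hand. The sketch below is an illustration under assumptions: the function name is hypothetical, the divisor of $s^2$ is taken to be $n - T_1$ (the number of observations in (0, 1)), and the simulated BEOI data use arbitrary parameter values.

```python
import math
import random

def cm_fit(sample, c):
    """Conditional-moment estimates for data on (0,1) plus the inflated point c."""
    n = len(sample)
    inner = [y for y in sample if 0.0 < y < 1.0]
    m = len(inner)                        # n - T1; the sketch needs m >= 2 and s2 > 0
    alpha_hat = (n - m) / n
    ybar = sum(inner) / m
    s2 = sum((y - ybar) ** 2 for y in inner) / m   # divisor n - T1 is an assumption
    mu_t = ybar
    phi_t = mu_t * (1.0 - mu_t) / s2 - 1.0
    # Plug-in estimates of E(y) and Var(y) from the moment formulas.
    mean_t = alpha_hat * c + (1.0 - alpha_hat) * mu_t
    var_t = ((1.0 - alpha_hat) * mu_t * (1.0 - mu_t) / (phi_t + 1.0)
             + alpha_hat * (1.0 - alpha_hat) * (c - mu_t) ** 2)
    return alpha_hat, mu_t, phi_t, mean_t, var_t

rng = random.Random(7)
# Simulated BEOI data with alpha = 0.2, mu = 0.3, phi = 5 (illustrative values).
data = [1.0 if rng.random() < 0.2 else rng.betavariate(1.5, 3.5) for _ in range(2000)]
alpha_hat, mu_t, phi_t, mean_t, var_t = cm_fit(data, c=1)
print(alpha_hat, mu_t, phi_t)
print(mean_t, sum(data) / len(data))  # plug-in mean next to the overall sample mean
```

With this choice of estimators the plug-in estimate of $E(y)$ coincides with the overall sample mean, as the algebra $\hat\alpha c + (1 - \hat\alpha)\bar{y}$ suggests.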

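The Fisher information matrix for the inflated beta distribution (2) is

$$ K(\theta) = \begin{pmatrix} \kappa_{\alpha\alpha} & 0 & 0 \\ 0 & \kappa_{\mu\mu} & \kappa_{\mu\phi} \\ 0 & \kappa_{\phi\mu} & \kappa_{\phi\phi} \end{pmatrix}, \qquad (5) $$

where $\kappa_{\alpha\alpha} = 1/\{\alpha(1 - \alpha)\}$, $\kappa_{\mu\mu} = (1 - \alpha)\phi^2\{\psi'(\mu\phi) + \psi'((1 - \mu)\phi)\}$,
$\kappa_{\mu\phi} = \kappa_{\phi\mu} = (1 - \alpha)\phi\{\psi'(\mu\phi)\mu - \psi'((1 - \mu)\phi)(1 - \mu)\}$ and
$\kappa_{\phi\phi} = (1 - \alpha)\{\mu^2\psi'(\mu\phi) + (1 - \mu)^2\psi'((1 - \mu)\phi) - \psi'(\phi)\}$.
Note that $K(\theta)$ does not depend on $c$. Also, $\alpha$ and $(\mu, \phi)$ are globally orthogonal and
hence the corresponding components of the score vector are uncorrelated. Since the inflated
beta distribution (2) belongs to an exponential family of full rank (see Proposition 2.1), it
follows that $\sqrt{n}(\hat\theta - \theta) \xrightarrow{D} \mathcal{N}_3(0, K(\theta)^{-1})$, with $K(\theta)$ given in (5), and that
$\hat\alpha$ and $(\hat\mu, \hat\phi)$ are asymptotically independent.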

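If the interest lies in estimating a function of $\theta$, $\tau(\theta)$ say, the delta method (Lehmann
& Casella 1998, § 1.9) can be used to obtain the asymptotic distribution of $\tau(\hat\theta)$, the ML
estimator of $\tau(\theta)$. If $\tau(\theta)$ is differentiable, then
$\sqrt{n}(\tau(\hat\theta) - \tau(\theta)) \xrightarrow{D} \mathcal{N}(0, \Lambda(\theta))$, where
$\Lambda(\theta) = \dot\tau(\theta)^\top K(\theta)^{-1} \dot\tau(\theta)$ with $\dot\tau(\theta) = \partial\tau(\theta)/\partial\theta$. In particular, the
maximum likelihood estimator of $E(y) = \alpha c + (1 - \alpha)\mu$ is $\hat\alpha c + (1 - \hat\alpha)\hat\mu$ and the variance
of its normal limiting distribution is $(c - \mu)^2\kappa^{\alpha\alpha} + (1 - \alpha)^2\kappa^{\mu\mu}$, where
$\kappa^{\alpha\alpha} = 1/\kappa_{\alpha\alpha}$ and $\kappa^{\mu\mu}$ is the (2, 2)-element of $K(\theta)^{-1}$. In a
similar fashion, ML estimation of $\mathrm{Var}(y)$ can be performed.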
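The delta-method variance of the ML estimator of $E(y)$ can be evaluated once the trigamma function $\psi'$ is available. The sketch below is an illustration under assumptions: the function names are hypothetical, and the trigamma routine (recurrence plus a standard asymptotic series) is a stand-in for a library implementation. It uses the BEZI setting $\alpha = 0.2$, $\mu = 0.1$, $\phi = 2$.

```python
import math

def trigamma(x):
    """psi'(x) for x > 0: shift x upward with psi'(x) = psi'(x+1) + 1/x^2, then use an asymptotic series."""
    total = 0.0
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv, inv2 = 1.0 / x, 1.0 / (x * x)
    series = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 * (1.0 / 42.0 - inv2 / 30.0)))
    return total + inv + inv2 / 2.0 + series

def fisher_beta_block(alpha, mu, phi):
    """Entries of the (mu, phi) block of the Fisher information matrix."""
    a, b = mu * phi, (1.0 - mu) * phi
    k_mm = (1.0 - alpha) * phi ** 2 * (trigamma(a) + trigamma(b))
    k_mp = (1.0 - alpha) * phi * (mu * trigamma(a) - (1.0 - mu) * trigamma(b))
    k_pp = (1.0 - alpha) * (mu ** 2 * trigamma(a) + (1.0 - mu) ** 2 * trigamma(b) - trigamma(phi))
    return k_mm, k_mp, k_pp

def asymptotic_var_of_mean(alpha, mu, phi, c):
    """Variance of the normal limit of sqrt(n) * (estimated E(y) - E(y))."""
    k_mm, k_mp, k_pp = fisher_beta_block(alpha, mu, phi)
    inv_mm = k_pp / (k_mm * k_pp - k_mp ** 2)   # (2,2) element of the inverse matrix
    inv_aa = alpha * (1.0 - alpha)              # inverse of 1 / {alpha (1 - alpha)}
    return (c - mu) ** 2 * inv_aa + (1.0 - alpha) ** 2 * inv_mm

alpha, mu, phi, c = 0.2, 0.1, 2.0, 0
lam = asymptotic_var_of_mean(alpha, mu, phi, c)
var_y = (1.0 - alpha) * mu * (1.0 - mu) / (phi + 1.0) + alpha * (1.0 - alpha) * (c - mu) ** 2
print(lam, var_y)  # delta-method variance next to Var(y)
```

As a rough sanity check, the result can be compared with $\mathrm{Var}(y)$, which is the corresponding limiting variance of the plain sample mean.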



$^2$If $T_1 = n$ (all observations equal $c$) the BEZI and the BEOI distributions are not recommended.

3 The zero-and-one-inflated beta distribution



The zero- and one-inflated distributions presented in Section 2 are not suitable for modeling
fractional data that contain both zeros and ones. For this situation, we propose a mixture
between a beta distribution and a Bernoulli distribution. Specifically, we assume that the
cumulative distribution function of the random variable $y$ is

$$ \mathrm{BEINF}(y; \alpha, \gamma, \mu, \phi) = \alpha\, \mathrm{Ber}(y; \gamma) + (1 - \alpha) F(y; \mu, \phi), $$

where $\mathrm{Ber}(\cdot; \gamma)$ represents the cumulative distribution function of a Bernoulli random variable
with parameter $\gamma$ and $F(\cdot; \mu, \phi)$ is the cumulative distribution function of $\mathcal{B}(\mu, \phi)$. Here,
$0 < \mu, \gamma, \alpha < 1$ and $\phi > 0$, $\alpha$ being the mixture parameter.

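Definition 3.1. Let $y$ be a random variable that assumes values in the closed interval [0,1].
We say that $y$ has a zero-and-one-inflated beta distribution (BEINF) with parameters $\alpha$, $\gamma$,
$\mu$ and $\phi$ if its density function with respect to the measure generated by the mixture$^3$ is given
by

$$ \mathrm{beinf}(y; \alpha, \gamma, \mu, \phi) = \begin{cases} \alpha(1 - \gamma), & \text{if } y = 0, \\ \alpha\gamma, & \text{if } y = 1, \\ (1 - \alpha) f(y; \mu, \phi), & \text{if } y \in (0, 1), \end{cases} \qquad (6) $$

with $0 < \alpha, \gamma, \mu < 1$ and $\phi > 0$, where $f(y; \mu, \phi)$ is the beta density function (1). We write
$y \sim \mathrm{BEINF}(\alpha, \gamma, \mu, \phi)$. Note that, if $y \sim \mathrm{BEINF}(\alpha, \gamma, \mu, \phi)$, then $P(y = 0) = \alpha(1 - \gamma)$ and
$P(y = 1) = \alpha\gamma$.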

After some algebra, the $r$th moment of $y$ and its variance can be written as

$$ E(y^r) = \alpha\gamma + (1 - \alpha)\mu_r, \quad r = 1, 2, \ldots, $$
$$ \mathrm{Var}(y) = \alpha V_1 + (1 - \alpha)V_2 + \alpha(1 - \alpha)(\gamma - \mu)^2, \qquad (7) $$

where $\mu_r$ is the $r$th moment of the beta distribution (1), $V_1 = \gamma(1 - \gamma)$ and $V_2 = V(\mu)/(\phi + 1)$.
Note that $E(y^r)$ is the weighted average of the $r$th moment of the Bernoulli distribution with
parameter $\gamma$ and the corresponding moment of the $\mathcal{B}(\mu, \phi)$ distribution with weights $\alpha$ and
$1 - \alpha$, respectively.
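The BEINF mixture and its first two moments can be illustrated together. In the sketch below the function names and the seed are assumptions; the parameter values $\alpha = 0.2$, $\gamma = 0.3$, $\mu = 0.1$, $\phi = 2$ are those of the BEINF simulation setting. The closed-form mean and variance are compared with a simulated sample.

```python
import random

def beinf_mean_var(alpha, gamma, mu, phi):
    """Mean and variance of the zero-and-one-inflated beta law."""
    v1 = gamma * (1.0 - gamma)
    v2 = mu * (1.0 - mu) / (phi + 1.0)
    mean = alpha * gamma + (1.0 - alpha) * mu
    var = alpha * v1 + (1.0 - alpha) * v2 + alpha * (1.0 - alpha) * (gamma - mu) ** 2
    return mean, var

def sample_beinf(n, alpha, gamma, mu, phi, rng):
    """Draw n BEINF values: Bernoulli(gamma) with probability alpha, beta otherwise."""
    a, b = mu * phi, (1.0 - mu) * phi
    out = []
    for _ in range(n):
        if rng.random() < alpha:                       # discrete component
            out.append(1.0 if rng.random() < gamma else 0.0)
        else:                                          # continuous component
            out.append(rng.betavariate(a, b))
    return out

alpha, gamma, mu, phi = 0.2, 0.3, 0.1, 2.0
mean, var = beinf_mean_var(alpha, gamma, mu, phi)
rng = random.Random(99)
draws = sample_beinf(50_000, alpha, gamma, mu, phi, rng)
share_zero = draws.count(0.0) / len(draws)   # estimates alpha * (1 - gamma)
share_one = draws.count(1.0) / len(draws)    # estimates alpha * gamma
sample_mean = sum(draws) / len(draws)
print(mean, var)
print(share_zero, share_one, sample_mean)
```

The formulas give a mean of 0.14 and a variance of 0.0724 up to floating-point rounding, the values reported with Table 2.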

Proposition 3.1. The zero-and-one-inflated beta distribution given in (6) is a four-parameter
exponential family distribution of full rank.

Proof. Let $\eta = (\eta_1, \eta_2, \eta_3, \eta_4)$ with $\eta_1 = \log(\alpha/(1 - \alpha)) - M(\eta_2) + B(\eta_3, \eta_4)$,
$\eta_2 = \log(\gamma/(1 - \gamma))$, $\eta_3 = \mu\phi$ and $\eta_4 = (1 - \mu)\phi$, where $M(\eta_2) = \log(1 + e^{\eta_2})$ and
$B(\eta_3, \eta_4) = \log(\Gamma(\eta_3)\Gamma(\eta_4)/\Gamma(\eta_3 + \eta_4))$, and let $T(y) = (t_1(y), t_2(y), t_3(y), t_4(y))$ with
$t_1(y) = \mathbb{1}_{\{0,1\}}(y)$, $t_2(y) = y\,\mathbb{1}_{\{0,1\}}(y)$, $t_3(y) = \log(y)$ if $y \in (0, 1)$ and 0 if
$y \in \{0, 1\}$, and $t_4(y) = \log(1 - y)$ if $y \in (0, 1)$ and 0 if $y \in \{0, 1\}$. Note that the BEINF
density function (6) can be written as

$$ \exp\{\eta^\top T(y) - B^*(\eta)\}\, h(y), \qquad (8) $$

where $B^*(\eta) = \log\{1 + \exp[\eta_1 + M(\eta_2) - B(\eta_3, \eta_4)]\} + B(\eta_3, \eta_4)$ is a real-valued function of
$\eta$ and $h(y) = 1/\{y(1 - y)\}$ if $y \in (0, 1)$ and 1 if $y \in \{0, 1\}$. The parameterization $\eta$ defines a
one-to-one transformation which maps
$\mathcal{X} = \{(\alpha, \gamma, \mu, \phi) : (\alpha, \gamma, \mu, \phi) \in (0, 1) \times (0, 1) \times (0, 1) \times \mathbb{R}_+\}$ onto
$\mathcal{D} = \{(\eta_1, \eta_2, \eta_3, \eta_4) : (\eta_1, \eta_2, \eta_3, \eta_4) \in \mathbb{R} \times \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}_+\}$.
Additionally, neither the $t$'s nor the $\eta$'s satisfy linear constraints and the parameter space
contains a four-dimensional rectangle. Therefore, (8) is the canonical representation of the
BEINF distribution in the four-parameter exponential family of full rank. □

$^3$The probability measure $P$ corresponding to $\mathrm{BEINF}(y; \cdot)$, defined over the measurable space
$([0, 1], \mathfrak{B})$, where $\mathfrak{B}$ is the class of all Borel subsets of [0, 1], is such that
$P \ll \lambda + \delta_0 + \delta_1$, with $\lambda$ representing the Lebesgue measure and $\delta_c$ a point mass at $c$,
i.e. $\delta_c(A) = 1$ if $c \in A$ and 0 if $c \notin A$, $A \in \mathfrak{B}$.

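Let $(y_1, \ldots, y_n)$ be a random sample of a BEINF distribution. Proposition 3.1 implies
that $\sum_{t=1}^n T(y_t) = (T_1, T_2, T_3, T_4)$, with $T_1 = \sum_{t=1}^n \mathbb{1}_{\{0,1\}}(y_t)$,
$T_2 = \sum_{t=1}^n y_t \mathbb{1}_{\{0,1\}}(y_t)$, $T_3 = \sum_{t: y_t \in (0,1)} \log(y_t)$ and
$T_4 = \sum_{t: y_t \in (0,1)} \log(1 - y_t)$, is a complete (minimal) sufficient statistic.
The likelihood function for $\theta = (\alpha, \gamma, \mu, \phi)$ given the sample $(y_1, \ldots, y_n)$ is

$$ L(\theta) = \prod_{t=1}^n \mathrm{beinf}(y_t; \alpha, \gamma, \mu, \phi) = L_1(\alpha) L_2(\gamma) L_3(\mu, \phi), $$

with

$$ L_1(\alpha) = \prod_{t=1}^n \alpha^{\mathbb{1}_{\{0,1\}}(y_t)} (1 - \alpha)^{1 - \mathbb{1}_{\{0,1\}}(y_t)} = \alpha^{T_1}(1 - \alpha)^{n - T_1}, $$

$$ L_2(\gamma) = \prod_{t: y_t \in \{0,1\}} \gamma^{y_t} (1 - \gamma)^{1 - y_t} = \gamma^{T_2}(1 - \gamma)^{T_1 - T_2}, $$

$$ L_3(\mu, \phi) = \prod_{t: y_t \in (0,1)} f(y_t; \mu, \phi). $$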

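The likelihood function factorizes in three terms, namely $L_1$, $L_2$ and $L_3$; $L_1$ depends only
on $\alpha$, $L_2$, only on $\gamma$ and $L_3$, only on $(\mu, \phi)$. Hence, $\alpha$, $\gamma$ and $(\mu, \phi)$ are separable parameters
and maximum likelihood inference for $\alpha$, $\gamma$ and $(\mu, \phi)$ can be performed independently.
The log-likelihood function can be written as

$$ \ell(\theta) = \log(L(\theta)) = \ell_1(\alpha) + \ell_2(\gamma) + \ell_3(\mu, \phi), $$

where

$$ \ell_1(\alpha) = T_1 \log \alpha + (n - T_1) \log(1 - \alpha), $$
$$ \ell_2(\gamma) = T_2 \log \gamma + (T_1 - T_2) \log(1 - \gamma), $$
$$ \ell_3(\mu, \phi) = (n - T_1) \log\left\{ \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\Gamma((1 - \mu)\phi)} \right\} + T_3(\mu\phi - 1) + T_4((1 - \mu)\phi - 1). $$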

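By differentiating $\ell_1(\alpha)$ with respect to $\alpha$, $\ell_2(\gamma)$ with respect to $\gamma$ and $\ell_3(\mu, \phi)$ with respect to
$\mu$ and $\phi$ we obtain the score vector $(U_\alpha(\alpha), U_\gamma(\gamma), U_\mu(\mu, \phi), U_\phi(\mu, \phi))$, where

$$ U_\alpha(\alpha) = T_1/\alpha - (n - T_1)/(1 - \alpha), $$
$$ U_\gamma(\gamma) = T_2/\gamma - (T_1 - T_2)/(1 - \gamma), $$
$$ U_\mu(\mu, \phi) = \phi\{(n - T_1)[\psi((1 - \mu)\phi) - \psi(\mu\phi)] + T_3 - T_4\}, $$
$$ U_\phi(\mu, \phi) = (n - T_1)[\psi(\phi) - \mu\psi(\mu\phi) - (1 - \mu)\psi((1 - \mu)\phi)] + \mu T_3 + (1 - \mu) T_4. $$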

It is easy to show that $\hat\alpha = T_1/n$ and $\hat\gamma = T_2/T_1$ (0/0 being regarded as 0) are the ML
estimators of $\alpha$ and $\gamma$, respectively. Here, $\hat\alpha$ is the proportion of zeros and ones in the sample
and $\hat\gamma$ is the proportion of ones among the observations that equal zero or one. Since $\hat\alpha$ is a
function of a complete sufficient statistic and is an unbiased estimator of $\alpha$, it is the UMVUE
of $\alpha$; its variance is given by $\mathrm{Var}(\hat\alpha) = \alpha(1 - \alpha)/n$. The ML estimators of $\mu$ and $\phi$ are obtained
as the solution of the nonlinear system of equations $(U_\mu(\mu, \phi), U_\phi(\mu, \phi)) = 0$. In practice,
ML estimates can be obtained through numerical maximization of the log-likelihood function
$\ell_3(\mu, \phi)$ using a nonlinear optimization algorithm. Closed-form estimators for $\mu$ and $\phi$, $\tilde\mu$ and
$\tilde\phi$ say, can be obtained using conditional moments of $y$ given that $y \in (0, 1)$, which
depend neither on $\alpha$ nor on $\gamma$; see Section 2. Likewise, closed-form estimators of $E(y^r)$ and
$\mathrm{Var}(y)$ can be obtained by replacing $\alpha$, $\gamma$, $\mu$ and $\phi$ by $\hat\alpha$, $\hat\gamma$, $\tilde\mu$ and $\tilde\phi$ in (7).
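The closed-form ML estimators of the mixture parameters only need the counts $T_1$ and $T_2$. The sketch below uses a small made-up data set (an assumption for illustration, not data from the paper) and a hypothetical function name.

```python
import math

def beinf_discrete_mle(sample):
    """Closed-form ML estimates of alpha and gamma from the counts T1 and T2."""
    n = len(sample)
    t1 = sum(1 for y in sample if y in (0.0, 1.0))   # number of zeros and ones
    t2 = sum(1 for y in sample if y == 1.0)          # number of ones
    alpha_hat = t1 / n
    gamma_hat = t2 / t1 if t1 > 0 else 0.0           # 0/0 regarded as 0
    return alpha_hat, gamma_hat

# Toy data: three zeros, one one and six values in (0, 1).
toy = [0.0, 0.0, 0.0, 1.0, 0.25, 0.5, 0.75, 0.1, 0.9, 0.6]
alpha_hat, gamma_hat = beinf_discrete_mle(toy)
p_zero_hat = alpha_hat * (1.0 - gamma_hat)   # estimate of P(y = 0)
p_one_hat = alpha_hat * gamma_hat            # estimate of P(y = 1)
print(alpha_hat, gamma_hat, p_zero_hat, p_one_hat)
```

For the toy data the estimated point masses reproduce the observed shares of zeros (3 out of 10) and ones (1 out of 10).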

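The Fisher information matrix for the parameters of the BEINF distribution can be
written as

$$ K(\theta) = \begin{pmatrix} \kappa_{\alpha\alpha} & 0 & 0 & 0 \\ 0 & \kappa_{\gamma\gamma} & 0 & 0 \\ 0 & 0 & \kappa_{\mu\mu} & \kappa_{\mu\phi} \\ 0 & 0 & \kappa_{\phi\mu} & \kappa_{\phi\phi} \end{pmatrix}, \qquad (9) $$

where $\kappa_{\alpha\alpha} = 1/\{\alpha(1 - \alpha)\}$, $\kappa_{\gamma\gamma} = \alpha/\{\gamma(1 - \gamma)\}$,
$\kappa_{\mu\mu} = (1 - \alpha)\phi^2\{\psi'(\mu\phi) + \psi'((1 - \mu)\phi)\}$,
$\kappa_{\mu\phi} = \kappa_{\phi\mu} = (1 - \alpha)\phi\{\psi'(\mu\phi)\mu - \psi'((1 - \mu)\phi)(1 - \mu)\}$ and
$\kappa_{\phi\phi} = (1 - \alpha)\{\mu^2\psi'(\mu\phi) + (1 - \mu)^2\psi'((1 - \mu)\phi) - \psi'(\phi)\}$. Here, $\alpha$, $\gamma$ and $(\mu, \phi)$ are
orthogonal parameters and, hence, the respective components of the score vector are
uncorrelated. Since the zero-and-one inflated beta distribution (6) belongs to an exponential
family of full rank (see Proposition 3.1), it follows that
$\sqrt{n}(\hat\theta - \theta) \xrightarrow{D} \mathcal{N}_4(0, K(\theta)^{-1})$, with $K(\theta)$ given in (9), and $\hat\alpha$, $\hat\gamma$ and
$(\hat\mu, \hat\phi)$ are asymptotically independent.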

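The delta method (see Section 2) is useful for obtaining the asymptotic distribution of the
ML estimator of any differentiable function $\tau(\theta)$. For instance, if
$\tau(\theta) = E(y) = \alpha\gamma + (1 - \alpha)\mu$, the variance of the normal limiting distribution of
$\widehat{E(y)} = \tau(\hat\theta)$ is $\alpha^2\kappa^{\gamma\gamma} + (1 - \alpha)^2\kappa^{\mu\mu} + (\gamma - \mu)^2\kappa^{\alpha\alpha}$. Here,
$\kappa^{\alpha\alpha} = 1/\kappa_{\alpha\alpha}$, $\kappa^{\gamma\gamma} = 1/\kappa_{\gamma\gamma}$ and $\kappa^{\mu\mu}$ is the (3, 3)-element of
$K(\theta)^{-1}$, with $K(\theta)$ given in (9).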



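There are other parameterizations of the BEINF distribution that can be useful. For
example, let $\gamma = \delta_1/\alpha$ and $\alpha = \delta_0 + \delta_1$. In this case, the BEINF density function can be
written as

$$ \mathrm{beinf}(y; \delta_0, \delta_1, \mu, \phi) = \begin{cases} \delta_0, & \text{if } y = 0, \\ \delta_1, & \text{if } y = 1, \\ (1 - \delta_0 - \delta_1) f(y; \mu, \phi), & \text{if } y \in (0, 1), \end{cases} \qquad (10) $$

with $f(y; \mu, \phi)$ representing the beta density (1). Here, the interpretation of the parameters
is more intuitive, since $\delta_0 = P(y = 0)$, $\delta_1 = P(y = 1)$ and $\mu$, $\phi$ are the parameters of the
beta distribution (1). However, this parameterization induces a restriction in the parameter
space given by $0 < \delta_0 + \delta_1 < 1$. Fisher's information matrix for the BEINF distribution in
this parameterization can be written as

$$ K(\theta) = \begin{pmatrix} \kappa_{\delta_0\delta_0} & \kappa_{\delta_0\delta_1} & 0 & 0 \\ \kappa_{\delta_1\delta_0} & \kappa_{\delta_1\delta_1} & 0 & 0 \\ 0 & 0 & \kappa_{\mu\mu} & \kappa_{\mu\phi} \\ 0 & 0 & \kappa_{\phi\mu} & \kappa_{\phi\phi} \end{pmatrix}, $$

where $\theta = (\delta_0, \delta_1, \mu, \phi)$, $\kappa_{\delta_0\delta_0} = (1 - \delta_1)/\{\delta_0(1 - \delta_0 - \delta_1)\}$,
$\kappa_{\delta_1\delta_1} = (1 - \delta_0)/\{\delta_1(1 - \delta_0 - \delta_1)\}$, $\kappa_{\delta_0\delta_1} = \kappa_{\delta_1\delta_0} = 1/(1 - \delta_0 - \delta_1)$,
$\kappa_{\mu\mu} = (1 - \alpha)\phi^2\{\psi'(\mu\phi) + \psi'((1 - \mu)\phi)\}$,
$\kappa_{\mu\phi} = \kappa_{\phi\mu} = (1 - \alpha)\phi\{\psi'(\mu\phi)\mu - \psi'((1 - \mu)\phi)(1 - \mu)\}$ and
$\kappa_{\phi\phi} = (1 - \alpha)\{\mu^2\psi'(\mu\phi) + (1 - \mu)^2\psi'((1 - \mu)\phi) - \psi'(\phi)\}$, with $\alpha = \delta_0 + \delta_1$.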

Here $\kappa_{\delta_0\delta_1} \neq 0$, thus indicating that $\delta_0$ and $\delta_1$ are not orthogonal parameters, and their
respective components in the score vector are correlated, in contrast to the parameterization
of the BEINF distribution given in (6). Recently, the BEINF distribution in this
parameterization was incorporated into the gamlss package in R (Stasinopoulos, Rigby &
Akantziliotou, 2006).
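The lack of orthogonality between $\delta_0$ and $\delta_1$ can be made concrete by inverting the $(\delta_0, \delta_1)$ block of the information matrix. The sketch below is an illustration with hypothetical function names; it uses $\delta_0 = 0.14$ and $\delta_1 = 0.06$, which correspond to $\alpha = 0.2$ and $\gamma = 0.3$, and compares the inverse with the familiar trinomial covariance terms $\delta_0(1 - \delta_0)$, $-\delta_0\delta_1$ and $\delta_1(1 - \delta_1)$.

```python
import math

def to_delta(alpha, gamma):
    """Map (alpha, gamma) to (delta0, delta1) = (P(y = 0), P(y = 1))."""
    return alpha * (1.0 - gamma), alpha * gamma

def to_alpha_gamma(delta0, delta1):
    """Inverse map: alpha = delta0 + delta1, gamma = delta1 / alpha."""
    alpha = delta0 + delta1
    return alpha, delta1 / alpha

def delta_block(delta0, delta1):
    """(delta0, delta1) block of the Fisher information in the alternative parameterization."""
    rest = 1.0 - delta0 - delta1
    k00 = (1.0 - delta1) / (delta0 * rest)
    k11 = (1.0 - delta0) / (delta1 * rest)
    k01 = 1.0 / rest
    return k00, k01, k11

def invert_sym_2x2(k00, k01, k11):
    """Inverse of a symmetric 2 x 2 matrix, returned as (i00, i01, i11)."""
    det = k00 * k11 - k01 * k01
    return k11 / det, -k01 / det, k00 / det

d0, d1 = to_delta(0.2, 0.3)
i00, i01, i11 = invert_sym_2x2(*delta_block(d0, d1))
print(i00, d0 * (1.0 - d0))   # asymptotic variance term for delta0
print(i01, -d0 * d1)          # nonzero off-diagonal term
print(i11, d1 * (1.0 - d1))   # asymptotic variance term for delta1
```

The nonzero off-diagonal entry of the inverse is what distinguishes this parameterization from the orthogonal $(\alpha, \gamma)$ one.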
4 Simulation results and discussion



We shall use Monte Carlo simulation to evaluate the finite sample performance of estimators
based on maximum likelihood (ML) and conditional moments (CM) for the BEZI and BEINF
distributions. For both distributions, the ML estimator of $\alpha$ is UMVUE. Hence, we do not
show simulation results for the estimation of this parameter. We focus our attention on the
estimation of $\mu$, $\phi$, $E(y)$ and $\mathrm{Var}(y)$; for the BEINF distribution, the estimation of $\gamma$ is also
considered. The parameters of the BEZI distribution used in the numerical exercise were
$\alpha = 0.2$, $\mu = 0.1$ and $\phi = 2$. For the BEINF distribution, $\alpha = 0.2$, $\gamma = 0.3$, $\mu = 0.1$ and $\phi = 2$.
The sample sizes considered were $n = 10, 20, 50, 100, 500, 1000$ and the number of Monte
Carlo replications was 5,000. The ML estimates of $\mu$ and $\phi$ were obtained by maximizing
the log-likelihood function using the BFGS method with analytical derivatives; the BFGS
quasi-Newton method is generally regarded as the best-performing nonlinear optimization
method (Mittelhammer, Judge and Miller, 2000, p. 199). All simulations were performed
using the Ox matrix programming language (Doornik, 2006).
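The design of such an experiment can be sketched in a few lines. The code below is not the Ox program used in the study: it is a reduced illustration in Python that covers only the CM estimators for the BEZI setting, uses 2,000 replications instead of 5,000, takes $n - T_1$ as the divisor of $s^2$, and skips degenerate samples. All function names and the seed are assumptions.

```python
import math
import random

def draw_bezi(n, alpha, mu, phi, rng):
    """One BEZI sample of size n."""
    a, b = mu * phi, (1.0 - mu) * phi
    return [0.0 if rng.random() < alpha else rng.betavariate(a, b) for _ in range(n)]

def cm_estimates(sample):
    """Conditional-moment estimates of (mu, phi); None for degenerate samples."""
    inner = [y for y in sample if 0.0 < y < 1.0]
    m = len(inner)
    if m < 2:
        return None
    ybar = sum(inner) / m
    s2 = sum((y - ybar) ** 2 for y in inner) / m   # divisor n - T1 is an assumption
    if s2 == 0.0:
        return None
    return ybar, ybar * (1.0 - ybar) / s2 - 1.0

def monte_carlo(n, reps, alpha, mu, phi, seed):
    """Bias of the CM estimators of mu and phi, and RMSE for phi, over the usable replications."""
    rng = random.Random(seed)
    mu_draws, phi_draws = [], []
    for _ in range(reps):
        est = cm_estimates(draw_bezi(n, alpha, mu, phi, rng))
        if est is not None:
            mu_draws.append(est[0])
            phi_draws.append(est[1])
    bias_mu = sum(mu_draws) / len(mu_draws) - mu
    bias_phi = sum(phi_draws) / len(phi_draws) - phi
    rmse_phi = math.sqrt(sum((p - phi) ** 2 for p in phi_draws) / len(phi_draws))
    return bias_mu, bias_phi, rmse_phi

bias_mu, bias_phi, rmse_phi = monte_carlo(n=50, reps=2000, alpha=0.2, mu=0.1, phi=2.0, seed=1)
print(bias_mu, bias_phi, rmse_phi)
```

Extending the loop over several sample sizes and adding an ML fit would reproduce the layout of the tables reported for this section.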

Table 1 presents simulation results for the BEZI distribution. The estimated biases of the
ML estimators of $\mu$, $E(y)$ and $\mathrm{Var}(y)$ are close to zero for all the sample sizes considered.
Also, the root mean square errors ($\sqrt{\mathrm{MSE}}$) of the ML and CM estimators of $\mu$, $E(y)$ and
$\mathrm{Var}(y)$ are similar. However, in small samples, the ML and CM estimators of $\phi$ can be
considerably biased, the CM estimator having much more pronounced bias than the ML
estimator. Additionally, the mean and the root mean square error of $\tilde\phi$ are much larger than
the corresponding figures obtained for $\hat\phi$. For instance, for $n = 10$, the bias and the root
mean square error are, respectively, 3.4 and 10.5 for the ML estimator and 6.5 and 21.2
for the CM estimator. It is noteworthy that, for all the sample sizes, the ML and CM
estimators of $\phi$ have positive bias; however, the variance of the response variable is only
slightly underestimated.

Table 2 summarizes simulation results for the BEINF distribution. The ML estimator of
$\gamma$ performs well if the sample size is not too small. The CM and ML estimators of $\mu$ are only
slightly biased. On the other hand, for very small samples (e.g. $n = 10$) the biases of the
CM and ML estimators of $E(y)$ and $\mathrm{Var}(y)$ are not negligible. We observed, however, that
the ML estimator performs better than the CM estimator, both in terms of bias and mean
square error. Again, the CM and ML estimators of $\phi$ are quite biased in small samples, the
ML estimator performing better than the CM estimator. For instance, for $n = 20$ and $\phi = 2$,
the bias and the root mean square error are approximately 1.0 and 2.4 for the ML estimator
and 1.7 and 3.9 for the CM estimator.

In short, for the BEZI and the BEINF distributions the ML estimator of $\phi$ is more
efficient than the CM estimator, both estimators being quite biased for very small samples
(e.g. $n = 10$). On the other hand, the ML and CM estimators of the other parameters, and
of $E(y)$ and $\mathrm{Var}(y)$, have similar performances, both being almost unbiased if the sample is
not very small.

Table 1: Simulation results for the BEZI distribution; $\alpha = 0.2$, $\mu = 0.1$, $\phi = 2.0$, $E(y) = 0.08$
and $\mathrm{Var}(y) = 0.0256$.

                         Mean                Bias           $\sqrt{\mathrm{MSE}}$
Par.          n        CM        ML        CM        ML        CM        ML
$\mu$        10    0.1009    0.0945    0.0009   -0.0055    0.0637    0.0591
             20    0.1005    0.0971    0.0005   -0.0029    0.0439    0.0406
             50    0.1004    0.0989    0.0004   -0.0011    0.0276    0.0262
            100    0.0999    0.0992   -0.0001   -0.0008    0.0194    0.0186
            500    0.0998    0.0996   -0.0002   -0.0004    0.0086    0.0083
           1000    0.0997    0.0997   -0.0003   -0.0003    0.0061    0.0058
$\phi$       10    8.5145    5.3980    6.5145    3.3980   21.1880   10.4990
             20    3.6960    2.9660    1.6960    0.9660    4.6498    2.8064
             50    2.4009    2.2750    0.4009    0.2750    1.1773    0.8370
            100    2.1840    2.1371    0.1840    0.1371    0.6793    0.5036
            500    2.0340    2.0292    0.0340    0.0292    0.2487    0.1959
           1000    2.0150    2.0133    0.0150    0.0133    0.1700    0.1342
E(y)         10    0.0785    0.0739   -0.0015   -0.0061    0.0503    0.0471
             20    0.0802    0.0775    0.0002   -0.0025    0.0354    0.0334
             50    0.0804    0.0792    0.0004   -0.0008    0.0228    0.0218
            100    0.0799    0.0794   -0.0001   -0.0006    0.0161    0.0155
            500    0.0798    0.0797   -0.0002   -0.0003    0.0072    0.0069
           1000    0.0800    0.0799    0.0000   -0.0001    0.0050    0.0048
Var(y)       10    0.0229    0.0228   -0.0027   -0.0028    0.0229    0.0203
             20    0.0244    0.0242   -0.0012   -0.0014    0.0167    0.0153
             50    0.0253    0.0252   -0.0003   -0.0004    0.0109    0.0104
            100    0.0254    0.0253   -0.0002   -0.0003    0.0078    0.0075
            500    0.0255    0.0255   -0.0001   -0.0001    0.0035    0.0034
           1000    0.0256    0.0255   -0.0000   -0.0001    0.0024    0.0024

5 Applications

This section contains three applications of inflated beta distributions to real data. For the
sake of comparison, we also fitted a Tobit model for each data set. Computation for fitting
inflated beta and Tobit models was carried out using the packages gamlss and VGAM in the R
software package (Ihaka & Gentleman, 1996), respectively. In gamlss, we used the BEZI
and the BEINF distributions implemented by Ospina (2006) and Stasinopoulos, Rigby &
Akantziliotou (2006), respectively.

Table 2: Simulation results for the BEINF distribution; $\alpha = 0.2$, $\gamma = 0.3$, $\mu = 0.1$, $\phi = 2.0$,
$E(y) = 0.14$ and $\mathrm{Var}(y) = 0.0724$.

                         Mean                Bias           $\sqrt{\mathrm{MSE}}$
Par.          n        CM        ML        CM        ML        CM        ML
$\gamma$     10              0.4462              0.1462              0.1348
             20              0.3882              0.0882              0.1616
             50              0.3102              0.0102              0.1374
            100              0.3015              0.0015              0.1047
            500              0.2995             -0.0005              0.0466
           1000              0.3011              0.0011              0.0323
$\mu$        10    0.1017    0.0955    0.0017   -0.0045    0.0661    0.0607
             20    0.1003    0.0969    0.0003   -0.0031    0.0445    0.0418
             50    0.1006    0.0992    0.0006   -0.0008    0.0278    0.0265
            100    0.0999    0.0993   -0.0001   -0.0007    0.0195    0.0186
            500    0.1002    0.1000    0.0002    0.0000    0.0086    0.0083
           1000    0.1001    0.0999    0.0001   -0.0001    0.0062    0.0060
$\phi$       10   10.9191    6.6869    8.9191    4.6869   33.3939   19.5225
             20    3.7055    2.9909    1.7055    0.9909    3.8645    2.3889
             50    2.4123    2.2821    0.4123    0.2821    1.1997    0.8229
            100    2.1824    2.1380    0.1824    0.1380    0.6407    0.4891
            500    2.0321    2.0210    0.0321    0.0210    0.2424    0.1894
           1000    2.0187    2.0115    0.0187    0.0115    0.1726    0.1363
E(y)         10    0.2002    0.1956    0.0602    0.0556    0.0679    0.0655
             20    0.1627    0.1601    0.0227    0.0201    0.0524    0.0512
             50    0.1423    0.1411    0.0023    0.0011    0.0363    0.0357
            100    0.1401    0.1396    0.0001   -0.0004    0.0266    0.0263
            500    0.1400    0.1398    0.0000   -0.0002    0.0121    0.0119
           1000    0.1402    0.1401    0.0002    0.0001    0.0085    0.0084
Var(y)       10    0.1129    0.1139    0.0405    0.0415    0.0313    0.0306
             20    0.0871    0.0873    0.0147    0.0149    0.0297    0.0293
             50    0.0727    0.0727    0.0003    0.0003    0.0232    0.0229
            100    0.0717    0.0717   -0.0007   -0.0006    0.0177    0.0176
            500    0.0721    0.0721   -0.0003   -0.0003    0.0080    0.0080
           1000    0.0724    0.0724    0.0000    0.0000    0.0057    0.0056

The first application uses a data set of Brazilian indicators of qualified priority services
in 2000. The data were extracted from the Atlas of Brazil Human Development database
available at http://www.pnud.org.br/. We modeled the percentage of qualified nurses in
645 Brazilian municipal districts. The data set has zeros; some municipal districts with
high levels of poverty do not have qualified nurses. The frequency histogram of the data
is presented in Figure 2. It has an inverted 'J' shape, a characteristic easily modeled by
the BEZI distribution. The vertical bar at zero represents the total number of zeros in the
sample. We also considered a left censored Tobit model, by assuming that $y_t = y_t^*$ if $y_t^* > 0$,
and $y_t = 0$ if $y_t^* \le 0$, where $y_t^* \sim \mathcal{N}(\mu, \sigma^2)$ are independent random variables. The ML
estimates (standard errors in parentheses) for the parameters of the BEZI distribution are
$\hat\alpha = 0.0155$ (0.0049), $\hat\mu = 0.1263$ (0.0042), and $\hat\phi = 4.691$ (0.220), and for the Tobit model,
$\hat\mu = 0.1177$ (0.0060) and $\hat\sigma = 0.1433$ (0.0040). The plot of the empirical distribution function
of the data along with the estimated cumulative distribution functions (see Figure 2) shows
that only the BEZI distribution fits the data well.

Figure 2: Frequency histogram and estimated cumulative distribution functions for the
percentage of qualified nurses in Brazilian municipal districts.

In the next application, we consider 5561 observations of proportions of infants less than
one year old who died of unknown causes in Brazilian municipal districts in 2000. The
data were obtained from the DATASUS database available at www.datasus.gov.br. The
data set contains 3364 zeros and 172 ones; the frequency histogram of the data is presented
in Figure 3. For this data set we used a BEINF distribution under the parameterization
$(\delta_0, \delta_1, \mu, \phi)$. Also, we fitted a doubly censored Tobit model, i.e. we assumed that $y_t = y_t^*$ if
$0 < y_t^* < 1$, $y_t = 0$ if $y_t^* \le 0$ and $y_t = 1$ if $y_t^* \ge 1$, where $y_t^* \sim \mathcal{N}(\mu, \sigma^2)$ are independent
random variables. We obtained the following ML estimates: $\hat\delta_0 = 0.6055$ (0.0066),
$\hat\delta_1 = 0.0313$ (0.0023), $\hat\mu = 0.2974$ (0.0043) and $\hat\phi = 0.4562$ (0.0050) for the BEINF distribution,
and $\hat\mu = -0.1555$ (0.0088) and $\hat\sigma = 0.5420$ (0.0085) for the Tobit model. The empirical
distribution and the BEINF and Tobit fitted cumulative distributions are shown in Figure
3. By visual inspection, it becomes clear that only the BEINF distribution is a suitable
theoretical model for the data at hand.

Finally, we modeled the proportion of inhabitants who lived within a 200 km wide coastal 
strip in 223 countries in the year 2002. The data are supplied by the Center for International 
Earth Science Information Network and are available at 
http://sedac.ciesin.columbia.edu/plue/nagd/place. Figure 4 shows that the histogram has a 'U' 
shape. For these data we fitted a BEINF distribution under the parameterization 
$(\delta_0, \delta_1, \mu, \phi)$ and a doubly censored Tobit model. The ML estimates for the parameters of 
the BEINF distribution are $\hat{\delta}_0 = 0.1141$ (0.0215), $\hat{\delta}_1 = 0.4064$ (0.0332), 
$\hat{\mu} = 0.6189$ (0.0279) and $\hat{\phi} = 0.6615$ (0.0204). For the Tobit model, we obtained the 
following estimates: $\hat{\mu} = 0.8766$ (0.0518) and $\hat{\sigma} = 0.6975$ (0.0368). Figure 4 shows the 
empirical distribution and the estimated cumulative distribution curves. Clearly, the Tobit 
model does not fit the data well. In contrast, the empirical distribution and the BEINF 
estimated cumulative distribution curves are quite close, and we may conclude that the BEINF 
distribution is suitable for modeling the data. 
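
The shape of the fitted continuous component can be read off the estimates. The sketch below 
assumes that the beta component is indexed by its mean mu and a precision parameter phi, so 
that the usual shape parameters are p = mu * phi and q = (1 - mu) * phi. This parameterization 
is an assumption made here rather than something stated in this section, and the function 
names are hypothetical. 

```python
from typing import Tuple


def beta_shape_parameters(mu: float, phi: float) -> Tuple[float, float]:
    """Convert an assumed mean-precision pair (mu, phi) to shape parameters (p, q)."""
    if not 0.0 < mu < 1.0 or phi <= 0.0:
        raise ValueError("need 0 < mu < 1 and phi > 0")
    return mu * phi, (1.0 - mu) * phi


def density_shape(p: float, q: float) -> str:
    """Qualitative shape of the beta(p, q) density on (0, 1)."""
    if p == 1.0 and q == 1.0:
        return "uniform"
    if p < 1.0 and q < 1.0:
        return "U-shaped"
    if p <= 1.0 and q >= 1.0:
        return "decreasing"
    if p >= 1.0 and q <= 1.0:
        return "increasing"
    return "unimodal"


# BEINF estimates of mu and phi quoted in the text for the coastal proximity data.
p, q = beta_shape_parameters(0.6189, 0.6615)
print(f"p = {p:.4f}, q = {q:.4f}, shape: {density_shape(p, q)}")
```

Under that assumption, both shape parameters for the coastal proximity data fall below one, 
so the fitted beta density rises toward both ends of the unit interval, in addition to the 
point masses at 0 and 1. 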

Figure 3: Frequency histogram and estimated cumulative distribution functions for the 
proportion of infants under one year of age who died of unknown causes in Brazilian municipal 
districts in 2000. 

Figure 4: Frequency histogram and estimated cumulative distribution functions; coastal 
proximity data. 

6 Concluding remarks 



The beta distribution is useful to model data that are measured continuously on the open 
interval (0, 1). However, data sets that contain zeros and/or ones cannot be modeled using a 
beta distribution. A possible solution is to transform the response variable so that it assumes 
values on the open unit interval. However, the use of transformations modifies the real nature 
of the data and does not allow the direct interpretation of the parameters in terms of the 
original response. 

In this paper, we propose mixed continuous-discrete distributions to model data that are 
observed on [0, 1), (0, 1] or [0, 1]. The proposed distributions are "inflated beta distributions" 
in the sense that the probability mass at 0 and/or 1 exceeds what is expected under the beta 
distribution. Properties of the inflated beta distributions are given. Also, estimation based 
on maximum likelihood and conditional moments is discussed and compared using Monte 
Carlo simulation. Overall, we recommend maximum likelihood estimation as the best choice. 

An alternative to the inflated beta distributions is to assume that a latent variable on (0, 1) 
gives rise to an observed response in [0, 1]. This approach has been suggested by Lesaffre, 
Rizopoulos and Tsonaka (2007). They assume that the $t$th observation is $y_t = r_t/N_t$, where 
$r_t \sim \mathrm{Bin}(U_t, N_t)$ and the $U_t$'s follow a logit-normal distribution. However, if the $N_t$'s are 
not known, as is the case in the examples presented in Section 5, this model cannot be used. 
They also consider the situation where the response is assumed to be a coarsened version 
of a latent variable with a logit-normal distribution on (0, 1). In other words, it is assumed 
that the logit-transformed latent variable has a normal distribution. The model we propose 
requires neither transformations nor the inclusion of a latent variable. Tobit models, on the 
other hand, do not require transformations but use a normally distributed latent variable. The 
assumed normality of the latent variable does not allow the Tobit models to be as flexible as 
the inflated beta distributions for modeling fractional data. Additionally, the interpretation of 
the parameters of Tobit models is rather difficult. For instance, the mean of doubly censored 
Tobit responses involves the cumulative distribution function and the probability density 
function of a standard normal distribution; see Hoff (2007, Section 4). 
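
To make the last point concrete, the following is a sketch of the standard calculation rather 
than a formula quoted from Hoff (2007). Assume the doubly censored Tobit setup used in 
Section 5, with latent variable $y_t^* \sim \mathcal{N}(\mu, \sigma^2)$ censored to [0, 1]. Write $\Phi$ and $\varphi$ for the 
standard normal distribution and density functions ($\varphi$ is used here only to avoid a clash 
with the precision parameter $\phi$), and set $a = -\mu/\sigma$ and $b = (1 - \mu)/\sigma$. Then 

$$
\mathrm{E}(y_t) = \bigl[1 - \Phi(b)\bigr] + \mu\,\bigl[\Phi(b) - \Phi(a)\bigr] + \sigma\,\bigl[\varphi(a) - \varphi(b)\bigr].
$$

For comparison, if the BEINF distribution places mass $\delta_0$ at 0, mass $\delta_1$ at 1, and the 
remaining mass on a beta component with mean $\mu$, the mixture structure gives 

$$
\mathrm{E}(y_t) = \delta_1 + (1 - \delta_0 - \delta_1)\,\mu .
$$

Note that $\mu$ plays a different role in the two expressions: in the Tobit model it is the mean of 
the latent normal variable, whereas in the BEINF distribution it is the mean of the response 
conditional on lying in (0, 1). 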

Three empirical applications using real data show that the inflated beta distributions are 
quite flexible for modeling fractional data on the closed or half-open unit interval. Also, for 
our data sets the Tobit models did not work well. 

We suggest that practitioners interested in modeling the behaviour of variables that 
assume values in the unit interval consider using a suitable inflated beta distribution whenever 
zeros and/or ones appear in the data set. 